Towards Normalizing the Edit Distance Using a Genetic Algorithms-Based Scheme
Abstract
The normalized edit distance is one of the distances derived from the edit distance. It is useful in some applications because it takes into account the lengths of the two strings being compared. The normalized edit distance is defined not in terms of edit operations but in terms of the edit path. In this paper we propose a new derivative of the edit distance that also takes the lengths of the two strings into consideration but is related directly to the edit distance. What distinguishes the new distance is that it uses genetic algorithms to set the values of its parameters. We conduct experiments to test the new distance and obtain promising results.
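As a rough illustration of the quantities the abstract refers to, the sketch below computes the unit-cost edit distance and then a simple length-scaled variant. The abstract does not give the proposed distance or its parameters, so `length_scaled_distance` is a hypothetical stand-in (edit distance divided by the longer string's length). It is neither the path-based normalized edit distance nor the paper's new distance, and it involves no genetic algorithm.

```python
def edit_distance(s: str, t: str) -> int:
    """Unit-cost edit (Levenshtein) distance, computed row by row."""
    # prev[j] holds the distance between the current prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty string
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(
                prev[j] + 1,         # delete cs from s
                curr[j - 1] + 1,     # insert ct into s
                prev[j - 1] + cost,  # substitute cs by ct (or match)
            ))
        prev = curr
    return prev[-1]


def length_scaled_distance(s: str, t: str) -> float:
    """Hypothetical illustration only: edit distance over the longer length.

    This is an assumption made for the example, not the distance
    proposed in the paper.
    """
    longest = max(len(s), len(t))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return edit_distance(s, t) / longest
```

Under this simplification, `"ab"` versus `"ax"` has an edit distance of 1 and a scaled value of 0.5, whereas a single substitution between two much longer strings yields a smaller scaled value. That is the sense in which a length-aware distance separates cases that the plain edit distance treats as equal.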
Similar resources
Adaptive Approximate Record Matching
Typographical data entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records that belong to the same entity. The aim of Approximate Record Matching is to find multiple records that belong to one entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...
Profiling the Distance Characteristics of Mutation Operators for Permutation-Based Genetic Algorithms
In this paper, we consider the permutation representation of genetic algorithms, and more generally, local search algorithms. We use a variety of permutation distance measures to profile the behavior of the most commonly used mutation operators for permutation-based genetic algorithms. Our operator profiles are also applicable to other local search algorithms, such as simulated annealing, as th...
Coverage Improvement In Wireless Sensor Networks Based On Fuzzy-Logic And Genetic Algorithm
Wireless sensor networks have been widely considered one of the most important 21st-century technologies and are used in many applications, such as environmental monitoring, security, and surveillance. Wireless sensor networks are used when it is not possible or convenient to supply signaling or power supply wires to a wireless sensor node. The wireless sensor node must be battery powered. C...
Phonologically Informed Edit Distance Algorithms for Word Alignment with Low-Resource Languages
Edit distance is commonly used to relate cognates across languages. This technique is particularly relevant for the processing of low-resource languages because the sparse data from such a language can be significantly bolstered by connecting words in the low-resource language with cognates in a related, higher-resource language. We present three methods for weighting edit distance algorithms bas...
Practical Methods for Approximate String Matching
Given a pattern string and a text, the task of approximate string matching is to find all locations in the text that are similar to the pattern. This type of search may be done, for example, in applications such as spelling error correction or bioinformatics. Typically, edit distance is used as the measure of similarity (or distance) between two strings. In this thesis we concentrate on unit-cost edit ...
Publication date: 2012